feat(airweave): add airweave block #3079

Open
EwanTauran wants to merge 4 commits into simstudioai:staging from EwanTauran:feat/block/airweave-integration

EwanTauran commented Jan 30, 2026

Summary

Adds Airweave integration to Sim, enabling agents to search across 30+ connected data sources including Stripe, GitHub, Notion, Slack, HubSpot, Zendesk, and more through a unified semantic search API.

Airweave makes any app searchable by syncing data from various sources with minimal configuration. This integration allows Sim workflows to query internal company data, customer information, and business metrics from all connected sources in a single search.

What is Airweave?

Airweave is an open-source platform that provides unified search across multiple business applications. It:

  • Connects to 30+ data sources: Stripe, GitHub, Notion, Slack, HubSpot, Zendesk, Linear, Jira, and more
  • Syncs data automatically: Keeps internal knowledge up-to-date with incremental updates
  • Semantic & keyword search: Uses vector embeddings for intelligent search with multiple retrieval strategies
  • Multi-tenant architecture: Supports OAuth2 and API key authentication
  • AI-powered answers: Can generate natural language answers from search results

Why This Integration Matters

Currently, Sim workflows that need to access company data require:

  • Multiple tool blocks (one per data source)
  • Complex workflow logic to aggregate results
  • Separate API keys and configurations for each service

With Airweave, agents can:

  • Search across all data sources with a single query
  • Choose retrieval strategies (hybrid, neural, or keyword) for optimal results
  • Get AI-generated summaries with the "Generate Answer" feature
  • Improve relevance with LLM-powered reranking

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Other: ___________

Implementation Details

Files Added

Tools (tools/airweave/):

| File | Description |
| --- | --- |
| types.ts | TypeScript type definitions for Airweave API |
| search.ts | Search tool implementation with ToolConfig |
| index.ts | Barrel exports |

Block:

| File | Description |
| --- | --- |
| blocks/blocks/airweave.ts | Block configuration with UI elements |

Files Modified

| File | Change |
| --- | --- |
| components/icons.tsx | Added AirweaveIcon |
| tools/registry.ts | Registered airweave_search tool |
| blocks/registry.ts | Registered airweave block |

Documentation Generated

| File | Description |
| --- | --- |
| apps/docs/content/docs/en/tools/airweave.mdx | Auto-generated tool documentation |
| apps/docs/components/icons.tsx | Added AirweaveIcon to docs |
| apps/docs/components/ui/icon-mapping.ts | Added icon mapping |
| apps/docs/content/docs/en/tools/meta.json | Added to tools list |

Tool Configuration

| Property | Value |
| --- | --- |
| Tool ID | airweave_search |
| Block Type | airweave |
| Category | Tools |
| Authentication | API Key |

Parameters

| Parameter | Type | Required | Visibility | Description |
| --- | --- | --- | --- | --- |
| collectionId | string | Yes | user-or-llm | The Airweave collection readable ID to search |
| query | string | Yes | user-or-llm | Search query text |
| apiKey | string | Yes | user-only | Airweave API key |
| limit | number | No | user-only | Maximum results (10, 25, 50, or 100) |
| retrievalStrategy | string | No | user-only | Search strategy: hybrid, neural, or keyword |
| expandQuery | boolean | No | user-only | Generate query variations to improve recall |
| rerank | boolean | No | user-only | Reorder results using LLM for better relevance |
| generateAnswer | boolean | No | user-only | Generate AI-powered answer from results |
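
To make the parameter contract concrete, here is a minimal sketch of how these parameters might be turned into an HTTP request. The `POST /collections/{collectionId}/search` path and the `X-API-Key` header are taken from the sequence diagram in the review further down this page. Everything else is an assumption for illustration: the `buildSearchRequest` helper, the `HttpRequestSpec` shape, the caller-supplied `baseUrl`, and the snake_case body field names are hypothetical and are not confirmed by this PR or by Airweave's API documentation.

```typescript
// Illustrative sketch only. The body field names (retrieval_strategy,
// expand_query, generate_answer) and the base URL are ASSUMPTIONS.
type RetrievalStrategy = "hybrid" | "neural" | "keyword";

interface AirweaveSearchParams {
  collectionId: string;
  query: string;
  apiKey: string;
  limit?: number;
  retrievalStrategy?: RetrievalStrategy;
  expandQuery?: boolean;
  rerank?: boolean;
  generateAnswer?: boolean;
}

// Hypothetical plain description of the request; not a Sim type.
interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

function buildSearchRequest(
  params: AirweaveSearchParams,
  baseUrl: string
): HttpRequestSpec {
  // Only the required query is always sent; optional parameters are
  // included only when the caller set them.
  const body: Record<string, unknown> = { query: params.query };
  if (params.limit !== undefined) body.limit = params.limit;
  if (params.retrievalStrategy !== undefined) {
    body.retrieval_strategy = params.retrievalStrategy; // assumed field name
  }
  if (params.expandQuery !== undefined) {
    body.expand_query = params.expandQuery; // assumed field name
  }
  if (params.rerank !== undefined) body.rerank = params.rerank;
  if (params.generateAnswer !== undefined) {
    body.generate_answer = params.generateAnswer; // assumed field name
  }

  return {
    url: `${baseUrl}/collections/${encodeURIComponent(params.collectionId)}/search`,
    method: "POST",
    headers: {
      "X-API-Key": params.apiKey,
      "Content-Type": "application/json",
    },
    body,
  };
}
```

Leaving unset optional parameters out of the body lets the API apply its own defaults, which seems a safer choice than hard-coding defaults on the Sim side.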

Outputs

| Output | Type | Description |
| --- | --- | --- |
| results | array | Search results with entity_id, source_name, md_content, score, metadata, breadcrumbs, url |
| completion | string | AI-generated answer (when generateAnswer is enabled) |
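
The output shape can be sketched as a small response transformation. The result field names come from the table above and the `{success, output}` envelope from the review's sequence diagram. The field types in `AirweaveResult` and the `transformSearchResponse` helper itself are assumptions for illustration, not the code in search.ts.

```typescript
// Illustrative sketch only. Field names are from the Outputs table;
// the field TYPES below are assumptions.
interface AirweaveResult {
  entity_id: string;
  source_name: string;
  md_content: string;
  score: number;
  metadata: Record<string, unknown>;
  breadcrumbs: unknown[];
  url: string;
}

interface SearchOutput {
  results: AirweaveResult[];
  completion?: string;
}

interface ToolResponse {
  success: boolean;
  output: SearchOutput;
}

// Hypothetical helper: maps a raw API payload to the tool output,
// falling back to an empty results array when none are present.
function transformSearchResponse(raw: unknown): ToolResponse {
  const data = (typeof raw === "object" && raw !== null ? raw : {}) as {
    results?: unknown;
    completion?: unknown;
  };

  const results = Array.isArray(data.results)
    ? (data.results as AirweaveResult[])
    : [];

  const output: SearchOutput = { results };
  // completion is only present when generateAnswer was enabled.
  if (typeof data.completion === "string") {
    output.completion = data.completion;
  }
  return { success: true, output };
}
```

With this shape, downstream blocks can always iterate over `output.results` without a null check, and can test for `output.completion` to see whether an answer was generated.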

Block Features

  • Visual Design: Indigo background (#6366F1) with Airweave logo
  • Retrieval Strategies: Dropdown to choose hybrid, neural, or keyword search
  • Query Expansion: Toggle to generate query variations for better recall
  • LLM Reranking: Toggle to reorder results by relevance
  • AI Answers: Toggle to generate natural language answers from results
  • Error Handling: Clear feedback for API errors and empty results

Usage Examples

As Standalone Block

Search for customer feedback across all connected platforms:

blocks:
  - id: airweave1
    type: airweave
    params:
      collectionId: "customer-data"
      query: "complaints about billing in the last week"
      retrievalStrategy: "hybrid"
      limit: 25
      rerank: true
      apiKey: <env.AIRWEAVE_API_KEY>

As Agent Tool with AI Answers

Enable agents to get summarized answers from company knowledge:

blocks:
  - id: agent1
    type: agent
    params:
      model: "openai/gpt-4o"
      systemPrompt: "You are a customer support agent with access to all company data."
      tools:
        - type: airweave
          params:
            collectionId: "company-knowledge"
            generateAnswer: true
            apiKey: <env.AIRWEAVE_API_KEY>

Setup Requirements

  1. Create Airweave account at https://app.airweave.ai
  2. Get API key from Airweave dashboard
  3. Create collection and connect data sources (Stripe, Slack, Notion, etc.)
  4. Copy collection readable ID from Airweave
  5. Use in Sim with the Airweave block or as an agent tool

Testing

  • Tool and block configuration verified against Airweave API documentation
  • Generated tool documentation using generate-docs script
  • Block appears in toolbar under "tools" category
  • Zero linter errors
  • Follows SimStudio naming conventions

Reviewers should verify:

  • Tool parameters match Airweave API contract
  • Block UI renders correctly with all subBlocks

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Related Resources

Breaking Changes

None - this is a new integration with no impact on existing functionality.

vercel bot commented Jan 30, 2026

@EwanTauran is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.

@EwanTauran (Author)

Running into some slight issues testing: when I run Sim locally I can't create blocks (not Airweave, not any other block for that matter), so I can't test the integration and move forward. Would be great if someone could help me out with this.

@waleedlatif1 (Collaborator)

@EwanTauran is the socket server running? you have to run bun run dev:full so things persist

@EwanTauran EwanTauran marked this pull request as ready for review February 2, 2026 12:44
EwanTauran commented Feb 2, 2026

that helped, thanks @waleedlatif1.

everything's working, I think we're ready to merge

@EwanTauran EwanTauran changed the base branch from main to staging February 2, 2026 20:25
@emir-karabeg (Collaborator)

@greptile

greptile-apps bot commented Feb 5, 2026

Greptile Overview

Greptile Summary

Added Airweave integration that enables unified semantic search across 30+ connected data sources (Stripe, GitHub, Notion, Slack, HubSpot, Zendesk, etc.) through a single API.

Implementation:

  • Created airweave_search tool with proper parameter handling for search configuration (retrieval strategy, query expansion, LLM reranking, answer generation)
  • Added comprehensive type definitions for request/response with detailed output properties
  • Implemented block configuration with UI controls for all search options
  • Registered tool and block in respective registries
  • Auto-generated documentation

Architecture alignment:

  • Follows single-tool block pattern (like Serper, Vision) with tools.access array - no tools.config needed since serializer automatically selects access[0]
  • Uses user-only visibility for apiKey parameter per custom instruction #2851870a
  • Proper error handling with fallback to empty results
  • Response transformation maps API response to standardized output schema
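
The single-tool selection described in the first bullet can be sketched roughly as follows. This is not Sim's serializer code: the `BlockTools` shape, the signature of `config.tool`, and the `resolveToolId` helper are hypothetical and only illustrate the fallback to `access[0]` when no `tools.config` is present.

```typescript
// Illustrative sketch only. These types are ASSUMED shapes, not Sim's
// actual block or serializer types.
interface BlockTools {
  access: string[];
  config?: {
    // Hypothetical selector for blocks that expose several tools.
    tool: (params: Record<string, unknown>) => string;
  };
}

function resolveToolId(
  tools: BlockTools,
  params: Record<string, unknown>
): string {
  // Multi-tool blocks choose a tool from their params.
  if (tools.config) {
    return tools.config.tool(params);
  }
  // Single-tool blocks fall back to the first entry of access.
  const first = tools.access[0];
  if (first === undefined) {
    throw new Error("Block declares no tools");
  }
  return first;
}

// A single-tool block such as the Airweave block needs only access.
const airweaveTools: BlockTools = { access: ["airweave_search"] };
const selectedTool = resolveToolId(airweaveTools, {});
```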

Key features:

  • Multiple retrieval strategies (hybrid, neural, keyword)
  • Optional query expansion for improved recall
  • LLM-powered reranking for relevance
  • AI-generated answers from search results

No issues found. The implementation is clean, well-structured, and follows all established patterns.

Confidence Score: 5/5

  • This PR is safe to merge with no issues found
  • Well-structured new feature implementation following established patterns. All files properly implement the Airweave integration with correct type definitions, parameter handling, and response transformation. The block configuration correctly uses a single-tool pattern consistent with similar blocks in the codebase.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/tools/airweave/search.ts | New tool implementation for Airweave search with proper parameter handling and response transformation |
| apps/sim/tools/airweave/types.ts | Type definitions for Airweave search with comprehensive output properties |
| apps/sim/blocks/blocks/airweave.ts | Block configuration with tools.access only (no tools.config; the serializer selects access[0]) |

Sequence Diagram

sequenceDiagram
    participant User
    participant SimWorkflow
    participant AirweaveBlock
    participant AirweaveTool
    participant AirweaveAPI

    User->>SimWorkflow: Configure Airweave block with collectionId, query, apiKey
    SimWorkflow->>AirweaveBlock: Serialize block config
    AirweaveBlock->>AirweaveBlock: Select tool from access array [0]
    AirweaveBlock-->>SimWorkflow: tool: "airweave_search"
    
    SimWorkflow->>AirweaveTool: Execute airweave_search with params
    AirweaveTool->>AirweaveTool: Build request body
    Note over AirweaveTool: Include optional params:<br/>limit, retrievalStrategy,<br/>expandQuery, rerank,<br/>generateAnswer
    
    AirweaveTool->>AirweaveAPI: POST /collections/{collectionId}/search
    Note over AirweaveAPI: X-API-Key header<br/>Content-Type: application/json
    
    AirweaveAPI-->>AirweaveTool: Search results + optional completion
    AirweaveTool->>AirweaveTool: Transform response
    Note over AirweaveTool: Map results to output schema<br/>with entity_id, source_name,<br/>md_content, score, etc.
    
    AirweaveTool-->>SimWorkflow: {success: true, output: {results, completion?}}
    SimWorkflow-->>User: Display search results

greptile-apps bot left a comment

3 files reviewed, no comments
